Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Replace validator's xss() with caja's sanitize() #1633

Closed
wants to merge 1 commit into from
Closed

[WIP] Replace validator's xss() with caja's sanitize() #1633

wants to merge 1 commit into from

Conversation

jgillich
Copy link
Contributor

@jgillich jgillich commented Dec 8, 2013

This is not yet ready (tests fail for some reason), just putting it here for feedback.

As discussed in #1378, this replaces node-validator's xss filter with the one from caja.

When you look at the changes, you will notice that this does not escape or replace the elements, instead it removes them entirely. This means that if you enter something like <script></script> in the title, you get told that an empty value isn't allowed. Probably not the best way to handle this.

I did some digging and as far as I can see, escaping would require us to modify caja heavily. What we could easily do is white/blacklisting of elements & attributes (sanitize() uses a internal list of elements that it considers unsafe, probably fine for our use case), but escaping instead of removing them is not really what it's been made for.

@javorszky
Copy link
Contributor

Will this work for example with themes that allow code snippets in the content? Tutorial, here's how you put some javascript into your html page:

<script>
(function ($) {
    $(document).ready(function () {
        alert('Yay');
    });
}(jQuery));
</script>

Would all of that be escaped into html entites and left in, or would all of that be stripped out entirely?

@jgillich
Copy link
Contributor Author

jgillich commented Dec 8, 2013

It depends; if it's sanitized, then it would be stripped out entirely. But the post content is not sanitized at all currently. Handlebars may still escape it afterwards.

@jgillich
Copy link
Contributor Author

Everything done, feel free to merge. A few notes:

  • In case sanitize removes the entire title, the error message is Post title cannot be blank (current text is classified as unsafe and will be removed). If it's not ending up empty, unsafe html is silently removed. What I would prefer is showing a warning whenever something has been removed, but as far as I can see there is no way to trigger a notification from a model -> probably lots of refactoring neccessary.
  • It's crazy how often sanitize is getting called. Most times it makes no sense because a) value is null or undefined or b) value is getting validated by node-validator anyway. sanitize should be used for values that allow HTML; we might need other filters for other value types (?).
  • permittedAttributes were already filtered by the base model, thus I removed that line.

@ErisDS
Copy link
Member

ErisDS commented Dec 16, 2013

I haven't look at this PR yet. Not sure whether it is a good idea to put this in 0.4 or not, will look and make a decision in the next day or so.

@jgillich
Copy link
Contributor Author

Up to you; The user feedback is not ideal currently, but the sanitizer should work fine.

@ErisDS
Copy link
Member

ErisDS commented Feb 18, 2014

Thought I'd come and update my thoughts on this.
Effectively, nanaging XSS is going to be one of the largest and most complex parts of 0.6. We're going to want to protect users from one another as much as possible.

To do it well I think we need to be able to give proper user feedback, so unless there have been developments in caja, unfortunately I think it may be a dead end?

@jgillich
Copy link
Contributor Author

I don't think it's related to caja, the sanitize function simply returns the safe html. The question is how to tell the user "we've just deleted your entire title because you put <script>alert('JS rocks!')</script> there and now we can't save your post".

My suggestion would be:

  • Disable (=escape) HTML for titles, support markdown instead. It's far more likely somebody wants to have <script> in a title than **.
  • Use caja for the post text. Since there is an instant preview, the user will notice when something had to be removed.
  • Re-evaluate the way the base model's sanitize function is being used. Most (HTML-disabled) text fields simply need to be escaped.

@ErisDS
Copy link
Member

ErisDS commented Feb 19, 2014

@jgillich with the exception of the second part about the editor / preview this sounds like a sensible approach.

The problem with the editor is that we do want to support <script> tags for certain purposes. I think there are a couple of ways of approaching this:

  • having script support as a permission which admins get by default
  • only executing the script for the user who created it (@sebgie was talking about this the other day and had some interesting ideas - be good to share those here?)
  • having a content security policy and whitelisting trusted services like github, twitter, facebook etc.

We possibly need a combination of some or all of the above?

@sebgie
Copy link
Contributor

sebgie commented Feb 22, 2014

I think that we need a combination of escaping (as described by @jgillich) and disabling JavaScript to make the Ghost admin safe for multi-user support.

  • escape all fields (title, tags, ...) that are not allowed to contain JS/HTML (lodashs _.escape() should work for that)
  • We want a user to be allowed to insert JavaScript in the editor. Therefore I think that it would make sense to handle included JavaScript as follows:
    • JS is executed on the frontend.
    • JS should not be executed in the editor, but I think that we could present a warning, that JavaScript is disabled (removed from the code) and ask the user to confirm that he is aware of the risk and wants to execute it anyway. It's the same concept as downloading an App from the internet and OSX is asking you if you really would like to open it.
    • My initial idea of not disabling JavaScript for the creator is flawed since another editor could add malicious code that would be executed in the context of the creator 💣.

@sebgie
Copy link
Contributor

sebgie commented Feb 28, 2014

@jgillich Is there a special reason why you published your own wrapper for Google-Caja? I'm just curious if there is something wrong with the other wrappers.

I had two thoughts that I would like to ask here:

  • How are we going to distinguish between JS that is executed and JS that is embedded as a code snippet?
  • I'm not sure anymore if it is really necessary to execute JavaScript in the preview window? It is okay to execute JS on the front end but I don't think that the preview is the right place to test your JS implementation given that the context (theme, surrounding elements, JS libraries) differs vastly from the blog page?

@jgillich
Copy link
Contributor Author

The other two wrappers I've checked both included a hacky workaround: https://github.com/theSmaw/Caja-HTML-Sanitizer/blob/master/sanitizer.js#L1083
I also mentioned that in #1378 (comment)

How are we going to distinguish between JS that is executed and JS that is embedded as a code snippet?

I think the markdown converter should run first, which would mean code snippets are already escaped when the content is sanitized.

@sheerun
Copy link

sheerun commented Mar 1, 2014

+1 for enabling javascript only on frontend, and using caja sanitizer for removing scripts on admin

@ErisDS
Copy link
Member

ErisDS commented Mar 1, 2014

There are many use cases where executing scripts in the admin would be valid, and various different considerations for what we should allow and where.

Firstly, the admin has two places where scripts in posts could potentially be executed:

  • the editor split screen
  • the content screen preview

Secondly, there are, as I see it, 2 different types of executable script, external sources (i.e. embeds), and inline/custom script blocks.

Right now, neither kind of script are executed on the editor, but on the content screen, embeds don't work but script blocks do:

Example on the editor:
editor

Example on the content screen (note the hello in the corner):
content

But in this case, I think it makes more sense that embedding a gist would work, and that writing some random code shouldn't.

So perhaps we need to use content security policies to build up a trusted list of embed-able scripts, and disable everything else in the admin?

However, if I'm the administrator of my blog... why shouldn't I be allowed to execute scripts? Perhaps it should be enabled/disabled on a roles basis?

@sebgie
Copy link
Contributor

sebgie commented Mar 3, 2014

However, if I'm the administrator of my blog... why shouldn't I be allowed to execute scripts? Perhaps it should be enabled/disabled on a roles basis?

That depends on your trust in your fellow Ghost users. Since we have no editing history we're unable to tell if someone else has edited a post and disable scripts accordingly. An attack scenario would be that another user adds a script to your blog post which is executed with admin rights when the administrator opens the post. This could be used to install apps or escalate privileges.

I think the markdown converter should run first, which would mean code snippets are already escaped when the content is sanitized.

You are absolutely right, that will work.

@sebgie
Copy link
Contributor

sebgie commented Mar 27, 2014

This conversation has come to a halt again. I think that this issue is something we should tackle with milestone 0.6? We can leave this PR open and add functionality when we have a final concept of how to handle XSS?

@ErisDS
Copy link
Member

ErisDS commented Apr 3, 2014

XSS is a big deal, and is going to be a primary focus of 0.6, so I agree, lets leave this PR and tackle this problem head on when we have more time for it in 0.6.

@ErisDS
Copy link
Member

ErisDS commented May 25, 2014

I'm going to close this now as we are working on adding escaping and also a CSP in order to tackle XSS problems in the admin.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants